Non-stationary domains that change in unpredictable ways are a challenge for agents searching for optimal policies in sequential decision-making problems. This paper presents a combination of Markov Decision Processes (MDP) with Answer Set Programming (ASP), named {\em Online ASP for MDP} (oASP(MDP)), a method capable of constructing the set of domain states while the agent interacts with a changing environment. oASP(MDP) updates previously obtained policies, learnt by means of Reinforcement Learning (RL), using rules that represent the domain changes observed by the agent. These rules represent a set of domain constraints that are processed as ASP programs, reducing the search space. Results show that oASP(MDP) is capable of finding solutions for problems in non-stationary domains without interfering with the action-value function approximation process.
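The abstract gives no implementation details, so the following is only a minimal sketch of the loop it describes: tabular Q-learning whose state set is rebuilt from constraint rules whenever the agent observes a domain change. The grid world, the reward values, and the `derive_states` function (a stand-in for the answer-set solving step, which a real implementation would delegate to an ASP solver such as clingo) are all hypothetical assumptions, not the authors' method.

```python
import random
from collections import defaultdict

GRID = 5                      # hypothetical 5x5 grid world
ACTIONS = [(0, 1), (0, -1), (1, 0), (-1, 0)]
GOAL = (4, 4)

def derive_states(blocked):
    """Stand-in for the ASP step: enumerate the states consistent with
    the current constraint rules (here, 'cell is blocked' facts)."""
    return {(x, y) for x in range(GRID) for y in range(GRID)
            if (x, y) not in blocked}

def step(state, action, states):
    """Transition; moves that violate a constraint leave the agent in place."""
    nxt = (state[0] + action[0], state[1] + action[1])
    if nxt not in states:
        nxt = state
    return nxt, (1.0 if nxt == GOAL else -0.01)

def q_learning(Q, states, episodes=500, alpha=0.1, gamma=0.95, eps=0.2):
    """Standard epsilon-greedy tabular Q-learning over the valid state set."""
    for _ in range(episodes):
        s = (0, 0)
        while s != GOAL:
            a = (random.choice(ACTIONS) if random.random() < eps
                 else max(ACTIONS, key=lambda act: Q[(s, act)]))
            s2, r = step(s, a, states)
            best = max(Q[(s2, b)] for b in ACTIONS)
            Q[(s, a)] += alpha * (r + gamma * best - Q[(s, a)])
            s = s2
    return Q

Q = defaultdict(float)
states = derive_states(blocked=set())
q_learning(Q, states)

# Domain change observed: new obstacles arrive as constraint rules.
# Rebuild the state set and keep the previously learned Q-values for
# states that remain valid, so learning resumes rather than restarts.
states = derive_states(blocked={(2, 2), (2, 3)})
Q = defaultdict(float, {k: v for k, v in Q.items() if k[0] in states})
q_learning(Q, states)
print(max(ACTIONS, key=lambda act: Q[((0, 0), act)]))  # greedy action at start
```

The key step this sketch tries to illustrate is the last block: the answer sets prune invalid states, and the value function is updated rather than re-learned from scratch, which is how the abstract's claim of not interfering with action-value function approximation would play out in practice.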